Subband-crosscorrelation analysis for robust speech recognition
Abstract
This paper describes subband-crosscorrelation (SBXCOR) analysis using two-channel signals. SBXCOR analysis extends subband-autocorrelation (SBCOR) analysis, a signal processing technique that extracts periodicities present in speech signals. In this paper, the performance of SBXCOR is investigated using a DTW word recognizer under acoustic conditions simulated on a computer and under a real environmental condition. Under the simulated condition, it is assumed that the speech signals in the two channels are perfectly synchronized while the noises are uncorrelated. Consequently, the effective signal-to-noise ratio of the signal generated by simply summing the two signals is raised by about 3 dB. In such a case, it is shown that SBXCOR is less robust than SBCOR extracted from the two-channel-summed signal, but more robust than the conventional one-channel SBCOR. The resultant performance was much better than that of the smoothed group delay spectrum and mel-frequency cepstral coefficients. In a real computer room, it is shown that SBXCOR is more robust than the two-channel-summed SBCOR.
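The 3 dB figure follows from the stated assumptions: summing two identical speech signals doubles the amplitude (four times the power), while summing two uncorrelated noises of equal power only doubles the noise power, so the ratio improves by a factor of two. The sketch below is a numerical check of that arithmetic only, not the paper's experiment. The sinusoid standing in for speech, the Gaussian noise, and the names `snr_db` and `summed_snr_gain_db` are all assumptions made for illustration.

```python
import numpy as np


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB, given separate signal and noise components."""
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2))


def summed_snr_gain_db(n_samples=200_000, seed=0):
    """SNR gain (dB) of the two-channel sum over a single channel.

    Assumes the same signal in both channels (perfect synchronization)
    and independent, equal-power noise in each channel.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples)
    speech = np.sin(2 * np.pi * 0.01 * t)    # hypothetical stand-in for speech
    noise1 = rng.standard_normal(n_samples)  # channel 1 noise
    noise2 = rng.standard_normal(n_samples)  # channel 2 noise, independent of channel 1

    snr_single = snr_db(speech, noise1)
    snr_summed = snr_db(speech + speech, noise1 + noise2)
    return snr_summed - snr_single


gain = summed_snr_gain_db()
print(f"SNR gain from summing two channels: {gain:.2f} dB")
```

With these assumptions the printed gain should be close to 10·log10(2), that is, about 3 dB. If the noises were partly correlated, or the speech signals were not aligned, the gain would be smaller.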
Similar resources
A binaural speech processing method using subband-cross correlation analysis for noise robust recognition
This paper describes an extended subband-crosscorrelation (SBXCOR) analysis to improve robustness against noise. The SBXCOR analysis, which has already been proposed, is a binaural speech processing technique that uses two input signals and extracts the periodicities associated with the inverse of the center frequency (CF) in each subband. In this paper, by taking an exponentially weighted sum of...
Spectral subband centroid features for speech recognition
Cepstral coefficients derived either through linear prediction (LP) analysis or from a filter bank are perhaps the most commonly used features in currently available speech recognition systems. In this paper, we propose spectral subband centroids as new features and use them as a supplement to cepstral features for speech recognition. We show that these features have properties similar to formant f...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing which applies to t...
A High-Dimensional Subband Speech Representation and SVM Framework for Robust Speech Recognition
This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) frontend that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. ...
Publication date: 1996